Brazoria County
Generative Multi-Objective Bayesian Optimization with Scalable Batch Evaluations for Sample-Efficient De Novo Molecular Design
Muthyala, Madhav R., Sorourifar, Farshud, Tan, Tianhong, Peng, You, Paulson, Joel A.
Designing molecules that must satisfy multiple, often conflicting objectives is a central challenge in molecular discovery. The enormous size of chemical space and the cost of high-fidelity simulations have driven the development of machine learning-guided strategies for accelerating design with limited data. Among these, Bayesian optimization (BO) offers a principled framework for sample-efficient search, while generative models provide a mechanism to propose novel, diverse candidates beyond fixed libraries. However, existing methods that couple the two often rely on continuous latent spaces, which introduces both architectural entanglement and scalability challenges. This work introduces an alternative, modular "generate-then-optimize" framework for de novo multi-objective molecular design/discovery. At each iteration, a generative model is used to construct a large, diverse pool of candidate molecules, after which a novel acquisition function, qPMHI (multi-point Probability of Maximum Hypervolume Improvement), is used to optimally select a batch of candidates most likely to induce the largest Pareto front expansion. The key insight is that qPMHI decomposes additively, enabling exact, scalable batch selection via only simple ranking of probabilities that can be easily estimated with Monte Carlo sampling. We benchmark the framework against state-of-the-art latent-space and discrete molecular optimization methods, demonstrating significant improvements across synthetic benchmarks and application-driven tasks. Specifically, in a case study related to sustainable energy storage, we show that our approach quickly uncovers novel, diverse, and high-performing organic (quinone-based) cathode materials for aqueous redox flow battery applications.
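The ranking-based batch selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two objectives to be maximized, an independent Gaussian posterior per candidate and objective, and a reading of qPMHI as "the probability that a candidate yields the largest hypervolume improvement in the pool". The function names `hypervolume_2d`, `pmhi_probabilities`, and `select_batch` are hypothetical.

```python
import random


def hypervolume_2d(points, ref):
    """Area dominated by `points` relative to `ref` (two objectives, maximization)."""
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: -p[0])
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 > best_f2:  # point extends the dominated region upward
            hv += (f1 - ref[0]) * (f2 - best_f2)
            best_f2 = f2
    return hv


def pmhi_probabilities(means, stds, front, ref, n_samples=1000, seed=0):
    """Monte Carlo estimate of P(candidate i has the largest hypervolume improvement).

    Assumption: each candidate's objectives are sampled from independent Gaussians.
    """
    rng = random.Random(seed)
    base = hypervolume_2d(front, ref)
    counts = [0] * len(means)
    for _ in range(n_samples):
        best_i, best_gain = None, 0.0
        for i, (m, s) in enumerate(zip(means, stds)):
            y = (rng.gauss(m[0], s[0]), rng.gauss(m[1], s[1]))
            gain = hypervolume_2d(front + [y], ref) - base
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is not None:  # no winner if nothing improves the front
            counts[best_i] += 1
    return [c / n_samples for c in counts]


def select_batch(probs, q):
    """Pick the q candidates with the highest estimated probabilities."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])[:q]


front = [(1.0, 1.0)]
means = [(0.5, 0.5), (3.0, 3.0), (1.2, 1.2)]
stds = [(0.05, 0.05)] * 3
probs = pmhi_probabilities(means, stds, front, ref=(0.0, 0.0))
batch = select_batch(probs, q=2)
```

Because the batch is chosen by sorting per-candidate probabilities, the cost of selection grows with the pool size rather than with the number of possible batches, which is the scalability property the abstract highlights.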
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Texas > Brazoria County > Lake Jackson (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Geological Inference from Textual Data using Word Embeddings
Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil
This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensionality reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we use the haversine equation to calculate the distance between identified mine locations and the ten cities most semantically related to the target keyword. The results demonstrate that combining NLP with dimensionality reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the predicted locations fall in the same region as the expected location, the accuracy has room for improvement.
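The two measurements named in this abstract, cosine-similarity ranking and haversine distance, can be sketched as follows. This is an illustrative example only: the embedding vectors and city names are made-up placeholders rather than GloVe outputs, and the Earth radius of 6371 km is a commonly used mean value assumed here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(target, embeddings, top_k=10):
    """Return the top_k names whose vectors are most similar to `target`."""
    scored = sorted(embeddings.items(),
                    key=lambda kv: -cosine_similarity(target, kv[1]))
    return [name for name, _ in scored[:top_k]]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical 3-dimensional embeddings for a keyword and three city names.
keyword_vec = [1.0, 0.0, 0.0]
city_vecs = {
    "city_a": [0.9, 0.1, 0.0],
    "city_b": [0.0, 1.0, 0.0],
    "city_c": [0.5, 0.5, 0.0],
}
ranked = rank_by_similarity(keyword_vec, city_vecs, top_k=2)
half_equator = haversine_km(0.0, 0.0, 0.0, 180.0)
```

Here `ranked` orders the placeholder cities by similarity to the keyword, and `half_equator` is the distance between two antipodal points on the equator, a simple sanity check for the distance function.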
- Europe > United Kingdom (0.05)
- Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
- North America > Canada > British Columbia (0.04)
- Energy (0.94)
- Materials > Metals & Mining > Lithium (0.50)
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
Zhuang, Ziyuan, Zhang, Zhiyang, Cheng, Sitao, Yang, Fangkai, Liu, Jia, Huang, Shujian, Lin, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi
Retrieval-augmented generation (RAG) methods encounter difficulties when addressing complex questions like multi-hop queries. While iterative retrieval methods improve performance by gathering additional information, current approaches often rely on multiple calls of large language models (LLMs). In this paper, we introduce EfficientRAG, an efficient retriever for multi-hop question answering. EfficientRAG iteratively generates new queries without the need for LLM calls at each iteration and filters out irrelevant information. Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
- Asia > Singapore (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > Texas > Montgomery County > Conroe (0.04)
- Leisure & Entertainment (1.00)
- Media > Film (0.46)
Variational Bayesian Optimal Experimental Design with Normalizing Flows
Dong, Jiayuan, Jacobsen, Christian, Khalloufi, Mehdi, Akram, Maryam, Liu, Wanjiao, Duraisamy, Karthik, Huan, Xun
Bayesian optimal experimental design (OED) seeks experiments that maximize the expected information gain (EIG) in model parameters. Directly estimating the EIG using nested Monte Carlo is computationally expensive and requires an explicit likelihood. Variational OED (vOED), in contrast, estimates a lower bound of the EIG without likelihood evaluations by approximating the posterior distributions with variational forms, and then tightens the bound by optimizing its variational parameters. We introduce the use of normalizing flows (NFs) for representing variational distributions in vOED; we call this approach vOED-NFs. Specifically, we adopt NFs with a conditional invertible neural network architecture built from compositions of coupling layers, and enhanced with a summary network for data dimension reduction. We present Monte Carlo estimators of the lower bound along with gradient expressions to enable a gradient-based simultaneous optimization of the variational parameters and the design variables. The vOED-NFs algorithm is then validated in two benchmark problems, and demonstrated on a partial differential equation-governed application of cathodic electrophoretic deposition and an implicit likelihood case with stochastic modeling of aphid population. The findings suggest that a composition of 4-5 coupling layers is able to achieve lower EIG estimation bias, under a fixed budget of forward model runs, compared to previous approaches. The resulting NFs produce approximate posteriors that agree well with the true posteriors and capture non-Gaussian and multi-modal features effectively.
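For context, the nested Monte Carlo baseline mentioned in this abstract can be sketched on a toy problem. This is not the vOED-NFs method; it assumes a hypothetical linear-Gaussian model, y = theta * d + noise with theta ~ N(0, 1) and noise ~ N(0, sigma^2), chosen because its EIG has the closed form 0.5 * ln(1 + d^2 / sigma^2) to compare against. The function names are illustrative.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def nested_mc_eig(d, sigma=1.0, n_outer=500, n_inner=300, seed=0):
    """Nested Monte Carlo EIG estimate for the toy model y = theta * d + noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)                 # outer prior sample
        y = theta * d + rng.gauss(0.0, sigma)       # simulated observation
        log_lik = log_normal_pdf(y, theta * d, sigma)
        # Inner loop: estimate the evidence p(y | d) with fresh prior samples.
        inner = [log_normal_pdf(y, rng.gauss(0.0, 1.0) * d, sigma)
                 for _ in range(n_inner)]
        m = max(inner)                              # log-sum-exp for stability
        log_evidence = m + math.log(sum(math.exp(v - m) for v in inner) / n_inner)
        total += log_lik - log_evidence
    return total / n_outer


def exact_eig(d, sigma=1.0):
    """Closed-form EIG for the linear-Gaussian toy model."""
    return 0.5 * math.log(1.0 + (d / sigma) ** 2)


estimate = nested_mc_eig(1.0)
reference = exact_eig(1.0)
```

Each outer sample needs its own inner loop of likelihood evaluations, so the cost scales with `n_outer * n_inner`; this is the expense, together with the need for an explicit likelihood, that the abstract contrasts with the variational lower bound.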
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Texas > Brazoria County > Lake Jackson (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Generative AI for Education (GAIED): Advances, Opportunities, and Challenges
Denny, Paul, Gulwani, Sumit, Heffernan, Neil T., Käser, Tanja, Moore, Steven, Rafferty, Anna N., Singla, Adish
This survey article has grown out of the GAIED (pronounced "guide") workshop organized by the authors at the NeurIPS 2023 conference. We organized the GAIED workshop as part of a community-building effort to bring together researchers, educators, and practitioners to explore the potential of generative AI for enhancing education. This article aims to provide an overview of the workshop activities and highlight several future research directions in the area of GAIED.
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Africa > Sierra Leone (0.04)
- North America > United States > Texas > Fort Bend County (0.04)
- Overview (1.00)
- Instructional Material > Course Syllabus & Notes (0.88)
- Education > Educational Setting (1.00)
- Education > Curriculum > Subject-Specific Education (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.83)
Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition
Jacobsen, Christian, Dong, Jiayuan, Khalloufi, Mehdi, Huan, Xun, Duraisamy, Karthik, Akram, Maryam, Liu, Wanjiao
We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems, employing inference techniques and machine learning enhancements. As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating. Our approach illustrates a systematic procedure for enhancing physical models by identifying their limitations through inference on experimental data and introducing adaptable model enhancements to address these shortcomings. We begin by tackling the issue of model parameter identifiability, which reveals aspects of the model that require improvement. To address generalizability, we introduce modifications which also enhance identifiability. However, these modifications do not fully capture essential experimental behaviors. To overcome this limitation, we incorporate interpretable yet flexible augmentations into the baseline model. These augmentations are parameterized by simple fully-connected neural networks (FNNs), and we leverage machine learning tools, particularly Neural Ordinary Differential Equations (Neural ODEs), to learn these augmentations. Our simulations demonstrate that the machine learning-augmented model more accurately captures observed behaviors and improves predictive accuracy. Nevertheless, we contend that while the model updates offer superior performance and capture the relevant physics, we can reduce off-line computational costs by eliminating certain dynamics without compromising accuracy or interpretability in downstream predictions of quantities of interest, particularly film thickness predictions. The entire process outlined here provides a structured approach to leverage data-driven methods. Firstly, it helps us comprehend the root causes of model inaccuracies, and secondly, it offers a principled method for enhancing model performance.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > Texas > Brazoria County > Lake Jackson (0.04)
- North America > United States > Michigan > Wayne County > Dearborn (0.04)
- Materials > Chemicals (0.46)
- Automobiles & Trucks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Machine Learning and Computer Vision Techniques in Continuous Beehive Monitoring Applications: A survey
Bilik, Simon, Zemcik, Tomas, Kratochvila, Lukas, Ricanek, Dominik, Richter, Milos, Zambanini, Sebastian, Horak, Karel
The wide use and availability of machine learning and computer vision techniques allow the development of relatively complex monitoring systems in many domains. Besides the traditional industrial domain, new applications are also appearing in biology and agriculture, including the detection of infections, parasites, and weeds, as well as automated monitoring and early warning systems. This trend is also connected with the introduction of easily accessible hardware and development kits such as the Arduino or Raspberry Pi families. In this paper, we survey 50 existing papers focusing on methods of automated beehive monitoring using computer vision techniques, particularly pollen and Varroa mite detection together with bee traffic monitoring. Such systems could also be used to monitor honeybee colonies and inspect their health state, which could identify potentially dangerous states before the situation becomes critical, or help to better plan periodic bee colony inspections and therefore save significant costs. We also include an analysis of research trends in this application field and outline possible directions for new explorations. Our paper is also aimed at veterinary and apidology professionals and experts who might not be familiar with machine learning, to introduce them to its possibilities; therefore, each family of applications is opened by a brief theoretical introduction and motivation related to its base method. We hope that this paper will inspire other scientists to use machine learning techniques for other applications in beehive monitoring.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
- Health & Medicine > Consumer Health (1.00)
- Food & Agriculture > Agriculture (0.88)
Dynamic Documentation for AI Systems
Mehta, Soham, Rogers, Anderson, Gilbert, Thomas Krendl
AI documentation is a rapidly-growing channel for coordinating the design of AI technologies with policies for transparency and accessibility. Calls to standardize and enact documentation of algorithmic harms and impacts are now commonplace. However, documentation standards for AI remain inchoate, and fail to match the capabilities and social effects of increasingly impactful architectures such as Large Language Models (LLMs). In this paper, we show the limits of present documentation protocols, and argue for dynamic documentation as a new paradigm for understanding and evaluating AI systems. We first review canonical approaches to system documentation outside the context of AI, focusing on the complex history of Environmental Impact Statements (EISs). We next compare critical elements of the EIS framework to present challenges with algorithmic documentation, which have inherited the limitations of EISs without incorporating their strengths. These challenges are specifically illustrated through the growing popularity of Model Cards and two case studies of algorithmic impact assessment in China and Canada. Finally, we evaluate more recent proposals, including Reward Reports, as potential components of fully dynamic AI documentation protocols.
- Asia > China (0.36)
- North America > Canada (0.25)
- North America > United States > Texas > Brazoria County (0.04)
- Law > Environmental Law (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (1.00)
The Morning After: Testing the best budget robot vacuums
I bought my first Roomba more than a decade ago. Unfortunately, cheap robovacs back then didn't have much in the way of intelligence or suction power, so it was mostly a curiosity that would bump around my apartment for a while before I followed up with a real vacuum. Whether you're like me or you've never had one, you're probably wondering if current models can do any better, and according to Valentina Palladino, the answer is yes. As she explains, "If you're someone who wants to spend as little time as possible cleaning your home -- or just someone who detests vacuuming -- then a semi-autonomous robot is a great investment." She reviewed several options that are available for under $300 and picked out a few that are worthy of slinking under your couch or into dusty corners.
Coordinating Disaster Emergency Response with Heuristic Reinforcement Learning
Nguyen, Long, Yang, Zhou, Zhu, Jiazhen, Li, Jia, Jin, Fang
A crucial and time-sensitive task when any disaster occurs is to rescue victims and distribute resources to the right groups and locations. This task is challenging in populated urban areas, due to the huge burst of help requests generated in a very short period. To improve the efficiency of the emergency response in the immediate aftermath of a disaster, we propose a heuristic multi-agent reinforcement learning scheduling algorithm, named ResQ, which can effectively schedule the rapid deployment of volunteers to rescue victims in dynamic settings. The core concept is to quickly identify victims and volunteers from social network data and then schedule rescue parties with an adaptive learning algorithm. This framework performs two key functions: 1) identify trapped victims and rescue volunteers, and 2) optimize the volunteers' rescue strategy in a complex time-sensitive environment. The proposed ResQ algorithm can speed up the training process through a heuristic function that reduces the state-action space by favoring a set of particular actions over others. Experimental results showed that the proposed heuristic multi-agent reinforcement learning based scheduling outperforms several state-of-the-art methods in terms of both reward rate and response times. Natural disasters have always posed a critical threat to human beings, often being accompanied by major loss of life and property damage. In recent years, we have witnessed more frequent and intense natural disasters all over the world.
- North America > Haiti (0.28)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Texas > Brazoria County > Pearland (0.04)